Traversing Itemset Lattices with Statistical Metric Pruning

Authors

  • Shinichi Morishita
  • Jun Sese
Abstract

We study how to efficiently compute significant association rules according to common statistical measures such as a chi-squared value or correlation coefficient. For this purpose, one might consider using the Apriori algorithm, but the algorithm needs major conversion, because none of these statistical metrics are anti-monotone, and using a higher support threshold to reduce the search space cannot guarantee that the solutions lie within that space. We present a method of estimating a tight upper bound on the statistical metric associated with any superset of an itemset, as well as a novel use of these upper bounds to prune unproductive supersets while traversing itemset lattices. Experimental tests demonstrate the efficiency of this method.
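The abstract does not spell out the bound, so the following is only a minimal sketch of one way such pruning can work for the chi-squared value. It assumes a 2x2 contingency setting with n transactions, m of which satisfy the target condition, and an itemset contained in x transactions, y of which satisfy the target. It also relies on the convexity of chi-squared in (x, y), which implies that the value for any superset is maximized at one of two extreme cases: the superset keeps only the y target-satisfying transactions, or only the x - y others. The function names `chi2` and `chi2_upper_bound` are illustrative and do not come from the paper.

```python
def chi2(x, y, n, m):
    """Chi-squared value of the 2x2 table defined by the counts.

    n: total transactions; m: transactions satisfying the target;
    x: transactions containing the itemset; y: those among the x
    that also satisfy the target.
    """
    # Degenerate tables carry no association; treat them as 0 (assumption).
    if x == 0 or x == n or m == 0 or m == n:
        return 0.0
    total = 0.0
    # Rows: contains itemset / does not. Columns: target / not target.
    for row_total, row_pos in ((x, y), (n - x, m - y)):
        for col_total, observed in ((m, row_pos), (n - m, row_total - row_pos)):
            expected = row_total * col_total / n
            total += (observed - expected) ** 2 / expected
    return total


def chi2_upper_bound(x, y, n, m):
    """Upper bound on chi2 over every superset of an itemset with counts (x, y).

    A superset has counts (x2, y2) with y2 <= y and x2 - y2 <= x - y.
    By convexity, the maximum over that region is attained at a vertex:
    (y, y) or (x - y, 0).
    """
    return max(chi2(y, y, n, m), chi2(x - y, 0, n, m))
```

With n = 100, m = 40, x = 50, y = 30, the itemset itself scores about 16.67 while the bound is about 64.29, so a superset can score higher than its subset. This is the failure of anti-monotonicity the abstract refers to. In a lattice traversal, a branch could be skipped whenever this bound falls below the best value found so far.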

Related resources

Mining itemset utilities from transaction databases

The rationale behind mining frequent itemsets is that only itemsets with high frequency are of interest to users. However, the practical usefulness of frequent itemsets is limited by the significance of the discovered itemsets. A frequent itemset only reflects the statistical correlation between items, and it does not reflect the semantic significance of the items. In this paper, we propose a u...

A New Algorithm for High Average-utility Itemset Mining

High utility itemset mining (HUIM) is a new emerging field in data mining which has gained growing interest due to its various applications. The goal of this problem is to discover all itemsets whose utility exceeds minimum threshold. The basic HUIM problem does not consider length of itemsets in its utility measurement and utility values tend to become higher for itemsets containing more items...

Completeness results for metrized rings and lattices

The Boolean ring $B$ of measurable subsets of the unit interval, modulo sets of measure zero, has proper radical ideals (for example, $\{0\}$) that are closed under the natural metric, but has no prime ideal closed under that metric; hence closed radical ideals are not, in general, intersections of closed prime ideals. Moreover, $B$ is known to be complete in its metric. Togethe...

Accelerating Closed Frequent Itemset Mining by Elimination of Null Transactions

The mining of frequent itemsets is often challenged by the length of the patterns mined and also by the number of transactions considered for the mining process. Another acute challenge that concerns the performance of any association rule mining algorithm is the presence of "null" transactions. This work proposes a closed frequent itemset mining algorithm viz., Closed Frequent Itemset Mining a...

Further Pruning for Efficient Association Rule Discovery

The Apriori algorithm’s frequent itemset approach has become the standard approach to discovering association rules. However, the computation requirements of the frequent itemset approach are infeasible for dense data and the approach is unable to discover infrequent associations. OPUS AR is an efficient algorithm for association rule discovery that does not utilize frequent itemsets and hence ...

Publication date: 2000